
Add flag evaluation metrics via OTel counter and OpenFeature Hook#11040

Open
typotter wants to merge 8 commits into master from typo/evaluations-logging

Contributor

@typotter typotter commented Apr 2, 2026

What Does This Do

Records a feature_flag.evaluations OTel counter metric on every flag evaluation via an OpenFeature finallyAfter hook. The hook captures all evaluation paths including type mismatches that occur above the provider level in the OpenFeature SDK pipeline.
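
As a rough illustration of why a finally-stage hook observes every evaluation, including ones that fail before the provider is consulted, here is a minimal sketch. It does not use the OpenFeature SDK: `EvalDetails`, `FinallyHook`, and `evaluate` are hypothetical stand-ins for the SDK's evaluation details, hook interface, and client pipeline, and the real signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class FinallyHookSketch {
    // Simulated evaluation pipeline: the resolver may throw, yet the hook still runs.
    static EvalDetails evaluate(String flagKey, Supplier<String> resolver, FinallyHook hook) {
        EvalDetails details = null;
        try {
            details = new EvalDetails(flagKey, resolver.get(), "TARGETING_MATCH", null);
        } catch (RuntimeException e) {
            // Error path, for example a type mismatch raised before the provider is consulted.
            details = new EvalDetails(flagKey, null, "ERROR", "TYPE_MISMATCH");
        } finally {
            hook.finallyAfter(details); // runs on both the success and the error path
        }
        return details;
    }

    public static void main(String[] args) {
        List<EvalDetails> recorded = new ArrayList<>();
        evaluate("checkout.v2", () -> "on", recorded::add);
        evaluate("checkout.v2", () -> { throw new IllegalStateException("type mismatch"); }, recorded::add);
        System.out.println("evaluations recorded: " + recorded.size());
    }
}

// Hypothetical stand-in for the SDK's evaluation details (not the real OpenFeature type).
final class EvalDetails {
    final String flagKey;
    final String variant;
    final String reason;
    final String errorCode; // null when the evaluation succeeded

    EvalDetails(String flagKey, String variant, String reason, String errorCode) {
        this.flagKey = flagKey;
        this.variant = variant;
        this.reason = reason;
        this.errorCode = errorCode;
    }
}

// Hypothetical stand-in for a hook that implements only the "finally" stage.
interface FinallyHook {
    void finallyAfter(EvalDetails details);
}
```

Because the hook sits in the `finally` branch, a counter incremented there would count the failed evaluation as well as the successful one.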

Creates a dedicated SdkMeterProvider with an OtlpHttpMetricExporter that sends metrics directly to the DD Agent's OTLP endpoint (/v1/metrics). This avoids the agent's OTel class shading (io.opentelemetry.api.* → datadog.trace.bootstrap.otel.api.*), which prevents using GlobalOpenTelemetry from the published dd-openfeature jar.

Metric attributes:

| Attribute | When present | Value |
| --- | --- | --- |
| feature_flag.key | Always | Flag key |
| feature_flag.result.variant | Always | Variant key (empty string if null) |
| feature_flag.result.reason | Always | Reason lowercased |
| error.type | On error | ErrorCode lowercased |
| feature_flag.result.allocation_key | When present | Allocation key from flag metadata |

New files: FlagEvalMetrics.java, FlagEvalHook.java, FlagEvalMetricsTest.java, FlagEvalHookTest.java
Modified files: Provider.java (adds getProviderHooks()), ProviderTest.java, build.gradle.kts
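
The attribute rules listed under "Metric attributes" can be sketched as a small pure function. This is an illustration only, not the code in FlagEvalHook.java: the class and method names are hypothetical, the inputs are plain strings rather than the SDK's evaluation details, and mapping a null reason to an empty string is an assumption the description does not state.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class FlagEvalAttributesSketch {
    // Builds the attribute set for one evaluation from plain string inputs.
    static Map<String, String> attributes(
            String flagKey, String variant, String reason, String errorCode, String allocationKey) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("feature_flag.key", flagKey);
        // Always present: a null variant becomes the empty string.
        attrs.put("feature_flag.result.variant", variant == null ? "" : variant);
        // Always present, lowercased. Null handling here is an assumption.
        attrs.put("feature_flag.result.reason",
                reason == null ? "" : reason.toLowerCase(Locale.ROOT));
        // Only on error: the error code, lowercased.
        if (errorCode != null) {
            attrs.put("error.type", errorCode.toLowerCase(Locale.ROOT));
        }
        // Only when the flag metadata carries an allocation key.
        if (allocationKey != null) {
            attrs.put("feature_flag.result.allocation_key", allocationKey);
        }
        return attrs;
    }

    public static void main(String[] args) {
        System.out.println(attributes("checkout.v2", "on", "TARGETING_MATCH", null, "alloc-1"));
        System.out.println(attributes("checkout.v2", null, "ERROR", "TYPE_MISMATCH", null));
    }
}
```

Using `Locale.ROOT` keeps the lowercasing independent of the JVM's default locale, which matters for attribute values that are compared across services.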

Motivation

Evaluation metrics allow tracking how many times flags are evaluated, with which results, across sessions. This is the Java implementation of the evaluation logging spec (FFL-1942), matching the existing Python (dd-trace-py#17029) and Go (dd-trace-go#4489) implementations.

System tests: 11/17 pass. The 6 remaining failures are pre-existing DDEvaluator gaps (reason mapping, parse error codes) addressed in separate PRs (#11036, #10971).

References:

Additional Notes

  • OTel SDK dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp) are compileOnly — applications must include them on the classpath for metrics to flow. Falls back to silent no-op when absent.
  • Export interval: 10s (matching Go SDK and EVALLOG.4 spec)
  • Endpoint resolution follows OTel spec: OTEL_EXPORTER_OTLP_METRICS_ENDPOINT → OTEL_EXPORTER_OTLP_ENDPOINT + /v1/metrics → http://localhost:4318/v1/metrics
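
The endpoint fallback chain in the last note can be sketched as follows. This is a minimal illustration, not the PR's code: the class and method names are hypothetical, the environment is passed in as a map so the lookup can be exercised without real environment variables, and trimming a trailing slash from the generic endpoint is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class OtlpMetricsEndpointSketch {
    static final String DEFAULT_ENDPOINT = "http://localhost:4318/v1/metrics";

    // Resolves the metrics endpoint from the given environment snapshot.
    static String resolve(Map<String, String> env) {
        String signal = env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT");
        if (signal != null && !signal.isEmpty()) {
            return signal; // signal-specific endpoint is used as-is
        }
        String generic = env.get("OTEL_EXPORTER_OTLP_ENDPOINT");
        if (generic != null && !generic.isEmpty()) {
            // Generic endpoint is treated as a base URL; the metrics path is appended.
            String base = generic.endsWith("/")
                    ? generic.substring(0, generic.length() - 1)
                    : generic;
            return base + "/v1/metrics";
        }
        return DEFAULT_ENDPOINT;
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        System.out.println(resolve(env));
        env.put("OTEL_EXPORTER_OTLP_ENDPOINT", "http://agent:4318");
        System.out.println(resolve(env));
        env.put("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT", "http://collector:4318/custom/metrics");
        System.out.println(resolve(env));
    }
}
```

The signal-specific variable wins and is not modified, while the generic variable only supplies a base URL, which is why the path is appended in that branch alone.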

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and (comp: or inst:) labels
  • Avoid using close, fix, or any linking keywords when referencing an issue

Jira ticket: FFL-1942

@typotter typotter added the type: feature request, tag: ai generated, and comp: openfeature labels Apr 2, 2026

pr-commenter bot commented Apr 8, 2026

Benchmarks

Startup

Parameters

|  | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | typo/evaluations-logging |
| git_commit_date | 1775744045 | 1775848507 |
| git_commit_sha | b266e2d | 340f25c |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~340f25c3f4 |
See matching parameters
|  | Baseline | Candidate |
| --- | --- | --- |
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775850563 | 1775850563 |
| ci_job_id | 1586850642 | 1586850642 |
| ci_pipeline_id | 107187465 | 107187465 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-n382mipy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-n382mipy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 8 unstable metrics.

| scenario | Δ mean execution_time | candidate mean execution_time | baseline mean execution_time |
| --- | --- | --- | --- |
| scenario:startup:petclinic:iast:Flare Poller | better [-341.962µs; -93.692µs] or [-9.557%; -2.618%] | 3.360ms | 3.578ms |
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.058 s) : 0, 1057557
Total [baseline] (11.012 s) : 0, 11012465
Agent [candidate] (1.05 s) : 0, 1050401
Total [candidate] (11.103 s) : 0, 11103445
section appsec
Agent [baseline] (1.25 s) : 0, 1249693
Total [baseline] (11.248 s) : 0, 11248098
Agent [candidate] (1.247 s) : 0, 1246914
Total [candidate] (11.182 s) : 0, 11182308
section iast
Agent [baseline] (1.235 s) : 0, 1235096
Total [baseline] (11.345 s) : 0, 11344950
Agent [candidate] (1.227 s) : 0, 1227386
Total [candidate] (11.267 s) : 0, 11267254
section profiling
Agent [baseline] (1.184 s) : 0, 1184104
Total [baseline] (11.049 s) : 0, 11048747
Agent [candidate] (1.179 s) : 0, 1179349
Total [candidate] (11.014 s) : 0, 11014404
  • baseline results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.058 s | - |
| Agent | appsec | 1.25 s | 192.136 ms (18.2%) |
| Agent | iast | 1.235 s | 177.54 ms (16.8%) |
| Agent | profiling | 1.184 s | 126.547 ms (12.0%) |
| Total | tracing | 11.012 s | - |
| Total | appsec | 11.248 s | 235.633 ms (2.1%) |
| Total | iast | 11.345 s | 332.485 ms (3.0%) |
| Total | profiling | 11.049 s | 36.281 ms (0.3%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.05 s | - |
| Agent | appsec | 1.247 s | 196.513 ms (18.7%) |
| Agent | iast | 1.227 s | 176.985 ms (16.8%) |
| Agent | profiling | 1.179 s | 128.948 ms (12.3%) |
| Total | tracing | 11.103 s | - |
| Total | appsec | 11.182 s | 78.863 ms (0.7%) |
| Total | iast | 11.267 s | 163.808 ms (1.5%) |
| Total | profiling | 11.014 s | -89.041 ms (-0.8%) |
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.218 ms) : 0, 1218
BytebuddyAgent [baseline] (633.317 ms) : 0, 633317
BytebuddyAgent [candidate] (629.712 ms) : 0, 629712
AgentMeter [baseline] (29.418 ms) : 0, 29418
AgentMeter [candidate] (29.28 ms) : 0, 29280
GlobalTracer [baseline] (248.655 ms) : 0, 248655
GlobalTracer [candidate] (247.665 ms) : 0, 247665
AppSec [baseline] (32.001 ms) : 0, 32001
AppSec [candidate] (32.01 ms) : 0, 32010
Debugger [baseline] (59.854 ms) : 0, 59854
Debugger [candidate] (60.063 ms) : 0, 60063
Remote Config [baseline] (589.54 µs) : 0, 590
Remote Config [candidate] (607.211 µs) : 0, 607
Telemetry [baseline] (8.084 ms) : 0, 8084
Telemetry [candidate] (8.076 ms) : 0, 8076
Flare Poller [baseline] (8.336 ms) : 0, 8336
Flare Poller [candidate] (5.804 ms) : 0, 5804
section appsec
crashtracking [baseline] (1.232 ms) : 0, 1232
crashtracking [candidate] (1.217 ms) : 0, 1217
BytebuddyAgent [baseline] (661.586 ms) : 0, 661586
BytebuddyAgent [candidate] (660.905 ms) : 0, 660905
AgentMeter [baseline] (12.027 ms) : 0, 12027
AgentMeter [candidate] (11.993 ms) : 0, 11993
GlobalTracer [baseline] (248.817 ms) : 0, 248817
GlobalTracer [candidate] (248.51 ms) : 0, 248510
AppSec [baseline] (185.175 ms) : 0, 185175
AppSec [candidate] (184.41 ms) : 0, 184410
Debugger [baseline] (66.779 ms) : 0, 66779
Debugger [candidate] (66.219 ms) : 0, 66219
Remote Config [baseline] (607.802 µs) : 0, 608
Remote Config [candidate] (601.133 µs) : 0, 601
Telemetry [baseline] (8.731 ms) : 0, 8731
Telemetry [candidate] (8.6 ms) : 0, 8600
Flare Poller [baseline] (3.663 ms) : 0, 3663
Flare Poller [candidate] (3.564 ms) : 0, 3564
IAST [baseline] (24.763 ms) : 0, 24763
IAST [candidate] (24.593 ms) : 0, 24593
section iast
crashtracking [baseline] (1.243 ms) : 0, 1243
crashtracking [candidate] (1.225 ms) : 0, 1225
BytebuddyAgent [baseline] (809.426 ms) : 0, 809426
BytebuddyAgent [candidate] (804.983 ms) : 0, 804983
AgentMeter [baseline] (11.533 ms) : 0, 11533
AgentMeter [candidate] (11.481 ms) : 0, 11481
GlobalTracer [baseline] (240.549 ms) : 0, 240549
GlobalTracer [candidate] (239.422 ms) : 0, 239422
AppSec [baseline] (33.998 ms) : 0, 33998
AppSec [candidate] (32.512 ms) : 0, 32512
Debugger [baseline] (58.421 ms) : 0, 58421
Debugger [candidate] (59.713 ms) : 0, 59713
Remote Config [baseline] (553.577 µs) : 0, 554
Remote Config [candidate] (533.36 µs) : 0, 533
Telemetry [baseline] (13.264 ms) : 0, 13264
Telemetry [candidate] (11.939 ms) : 0, 11939
Flare Poller [baseline] (3.578 ms) : 0, 3578
Flare Poller [candidate] (3.36 ms) : 0, 3360
IAST [baseline] (25.981 ms) : 0, 25981
IAST [candidate] (25.663 ms) : 0, 25663
section profiling
crashtracking [baseline] (1.183 ms) : 0, 1183
crashtracking [candidate] (1.17 ms) : 0, 1170
BytebuddyAgent [baseline] (691.069 ms) : 0, 691069
BytebuddyAgent [candidate] (688.651 ms) : 0, 688651
AgentMeter [baseline] (9.109 ms) : 0, 9109
AgentMeter [candidate] (9.085 ms) : 0, 9085
GlobalTracer [baseline] (207.228 ms) : 0, 207228
GlobalTracer [candidate] (206.278 ms) : 0, 206278
AppSec [baseline] (32.504 ms) : 0, 32504
AppSec [candidate] (32.348 ms) : 0, 32348
Debugger [baseline] (65.692 ms) : 0, 65692
Debugger [candidate] (65.447 ms) : 0, 65447
Remote Config [baseline] (585.235 µs) : 0, 585
Remote Config [candidate] (569.223 µs) : 0, 569
Telemetry [baseline] (7.924 ms) : 0, 7924
Telemetry [candidate] (7.768 ms) : 0, 7768
Flare Poller [baseline] (3.593 ms) : 0, 3593
Flare Poller [candidate] (3.547 ms) : 0, 3547
ProfilingAgent [baseline] (93.987 ms) : 0, 93987
ProfilingAgent [candidate] (93.592 ms) : 0, 93592
Profiling [baseline] (94.56 ms) : 0, 94560
Profiling [candidate] (94.155 ms) : 0, 94155
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.057 s) : 0, 1056686
Total [baseline] (8.836 s) : 0, 8835828
Agent [candidate] (1.061 s) : 0, 1060887
Total [candidate] (8.855 s) : 0, 8855459
section iast
Agent [baseline] (1.225 s) : 0, 1225135
Total [baseline] (9.571 s) : 0, 9571269
Agent [candidate] (1.229 s) : 0, 1229030
Total [candidate] (9.575 s) : 0, 9574561
  • baseline results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.057 s | - |
| Agent | iast | 1.225 s | 168.449 ms (15.9%) |
| Total | tracing | 8.836 s | - |
| Total | iast | 9.571 s | 735.442 ms (8.3%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
| --- | --- | --- | --- |
| Agent | tracing | 1.061 s | - |
| Agent | iast | 1.229 s | 168.143 ms (15.8%) |
| Total | tracing | 8.855 s | - |
| Total | iast | 9.575 s | 719.102 ms (8.1%) |
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.247 ms) : 0, 1247
crashtracking [candidate] (1.22 ms) : 0, 1220
BytebuddyAgent [baseline] (632.91 ms) : 0, 632910
BytebuddyAgent [candidate] (633.218 ms) : 0, 633218
AgentMeter [baseline] (29.315 ms) : 0, 29315
AgentMeter [candidate] (29.77 ms) : 0, 29770
GlobalTracer [baseline] (248.217 ms) : 0, 248217
GlobalTracer [candidate] (250.698 ms) : 0, 250698
AppSec [baseline] (32.011 ms) : 0, 32011
AppSec [candidate] (32.485 ms) : 0, 32485
Debugger [baseline] (59.239 ms) : 0, 59239
Debugger [candidate] (59.702 ms) : 0, 59702
Remote Config [baseline] (594.075 µs) : 0, 594
Remote Config [candidate] (618.565 µs) : 0, 619
Telemetry [baseline] (8.076 ms) : 0, 8076
Telemetry [candidate] (8.295 ms) : 0, 8295
Flare Poller [baseline] (8.987 ms) : 0, 8987
Flare Poller [candidate] (8.764 ms) : 0, 8764
section iast
crashtracking [baseline] (1.229 ms) : 0, 1229
crashtracking [candidate] (1.247 ms) : 0, 1247
BytebuddyAgent [baseline] (803.169 ms) : 0, 803169
BytebuddyAgent [candidate] (805.781 ms) : 0, 805781
AgentMeter [baseline] (11.387 ms) : 0, 11387
AgentMeter [candidate] (11.579 ms) : 0, 11579
GlobalTracer [baseline] (239.059 ms) : 0, 239059
GlobalTracer [candidate] (239.838 ms) : 0, 239838
AppSec [baseline] (32.534 ms) : 0, 32534
AppSec [candidate] (32.677 ms) : 0, 32677
Debugger [baseline] (59.356 ms) : 0, 59356
Debugger [candidate] (58.903 ms) : 0, 58903
Remote Config [baseline] (538.236 µs) : 0, 538
Remote Config [candidate] (534.367 µs) : 0, 534
Telemetry [baseline] (12.531 ms) : 0, 12531
Telemetry [candidate] (12.785 ms) : 0, 12785
Flare Poller [baseline] (3.445 ms) : 0, 3445
Flare Poller [candidate] (3.47 ms) : 0, 3470
IAST [baseline] (25.757 ms) : 0, 25757
IAST [candidate] (25.947 ms) : 0, 25947

Load

Parameters

|  | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | typo/evaluations-logging |
| git_commit_date | 1775744045 | 1775848507 |
| git_commit_sha | b266e2d | 340f25c |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~340f25c3f4 |
See matching parameters
|  | Baseline | Candidate |
| --- | --- | --- |
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775850812 | 1775850812 |
| ci_job_id | 1586850644 | 1586850644 |
| ci_pipeline_id | 107187465 | 107187465 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-m5hbigsh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-m5hbigsh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 17 metrics, 18 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| scenario:load:petclinic:tracing:high_load | better [-1272.689µs; -706.709µs] or [-6.961%; -3.865%] | unsure [-1.860ms; -0.510ms] or [-6.245%; -1.712%] | unstable [-14.933op/s; +40.746op/s] or [-5.968%; +16.284%] | 17.293ms | 28.603ms | 263.125op/s | 18.282ms | 29.789ms | 250.219op/s |
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.254 ms) : 1241, 1266
.   : milestone, 1254,
iast (3.191 ms) : 3150, 3232
.   : milestone, 3191,
iast_FULL (5.939 ms) : 5879, 6000
.   : milestone, 5939,
iast_GLOBAL (3.765 ms) : 3702, 3827
.   : milestone, 3765,
profiling (2.128 ms) : 2109, 2147
.   : milestone, 2128,
tracing (1.911 ms) : 1895, 1927
.   : milestone, 1911,
section candidate
no_agent (1.266 ms) : 1253, 1279
.   : milestone, 1266,
iast (3.302 ms) : 3254, 3350
.   : milestone, 3302,
iast_FULL (6.1 ms) : 6039, 6162
.   : milestone, 6100,
iast_GLOBAL (3.664 ms) : 3604, 3725
.   : milestone, 3664,
profiling (2.169 ms) : 2150, 2188
.   : milestone, 2169,
tracing (1.868 ms) : 1853, 1883
.   : milestone, 1868,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.254 ms [1.241 ms, 1.266 ms] | - |
| iast | 3.191 ms [3.15 ms, 3.232 ms] | 1.938 ms (154.6%) |
| iast_FULL | 5.939 ms [5.879 ms, 6.0 ms] | 4.686 ms (373.8%) |
| iast_GLOBAL | 3.765 ms [3.702 ms, 3.827 ms] | 2.511 ms (200.3%) |
| profiling | 2.128 ms [2.109 ms, 2.147 ms] | 874.064 µs (69.7%) |
| tracing | 1.911 ms [1.895 ms, 1.927 ms] | 657.463 µs (52.4%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.266 ms [1.253 ms, 1.279 ms] | - |
| iast | 3.302 ms [3.254 ms, 3.35 ms] | 2.036 ms (160.8%) |
| iast_FULL | 6.1 ms [6.039 ms, 6.162 ms] | 4.834 ms (381.8%) |
| iast_GLOBAL | 3.664 ms [3.604 ms, 3.725 ms] | 2.398 ms (189.4%) |
| profiling | 2.169 ms [2.15 ms, 2.188 ms] | 903.091 µs (71.3%) |
| tracing | 1.868 ms [1.853 ms, 1.883 ms] | 601.608 µs (47.5%) |
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.347 ms) : 18160, 18535
.   : milestone, 18347,
appsec (18.781 ms) : 18592, 18969
.   : milestone, 18781,
code_origins (17.821 ms) : 17645, 17997
.   : milestone, 17821,
iast (17.685 ms) : 17513, 17858
.   : milestone, 17685,
profiling (18.142 ms) : 17966, 18319
.   : milestone, 18142,
tracing (18.649 ms) : 18463, 18836
.   : milestone, 18649,
section candidate
no_agent (18.046 ms) : 17860, 18232
.   : milestone, 18046,
appsec (18.405 ms) : 18219, 18592
.   : milestone, 18405,
code_origins (17.812 ms) : 17637, 17988
.   : milestone, 17812,
iast (17.903 ms) : 17723, 18082
.   : milestone, 17903,
profiling (18.145 ms) : 17966, 18325
.   : milestone, 18145,
tracing (17.734 ms) : 17558, 17910
.   : milestone, 17734,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 18.347 ms [18.16 ms, 18.535 ms] | - |
| appsec | 18.781 ms [18.592 ms, 18.969 ms] | 433.156 µs (2.4%) |
| code_origins | 17.821 ms [17.645 ms, 17.997 ms] | -526.557 µs (-2.9%) |
| iast | 17.685 ms [17.513 ms, 17.858 ms] | -662.023 µs (-3.6%) |
| profiling | 18.142 ms [17.966 ms, 18.319 ms] | -204.942 µs (-1.1%) |
| tracing | 18.649 ms [18.463 ms, 18.836 ms] | 301.739 µs (1.6%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 18.046 ms [17.86 ms, 18.232 ms] | - |
| appsec | 18.405 ms [18.219 ms, 18.592 ms] | 358.989 µs (2.0%) |
| code_origins | 17.812 ms [17.637 ms, 17.988 ms] | -233.891 µs (-1.3%) |
| iast | 17.903 ms [17.723 ms, 18.082 ms] | -143.423 µs (-0.8%) |
| profiling | 18.145 ms [17.966 ms, 18.325 ms] | 99.165 µs (0.5%) |
| tracing | 17.734 ms [17.558 ms, 17.91 ms] | -312.446 µs (-1.7%) |

Dacapo

Parameters

|  | Baseline | Candidate |
| --- | --- | --- |
| baseline_or_candidate | baseline | candidate |
| git_branch | master | typo/evaluations-logging |
| git_commit_date | 1775744045 | 1775848507 |
| git_commit_sha | b266e2d | 340f25c |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~340f25c3f4 |
See matching parameters
|  | Baseline | Candidate |
| --- | --- | --- |
| application | biojava | biojava |
| ci_job_date | 1775850532 | 1775850532 |
| ci_job_id | 1586850647 | 1586850647 |
| ci_pipeline_id | 107187465 | 107187465 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-fhwg01tk 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-fhwg01tk 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.489 ms) : 1478, 1501
.   : milestone, 1489,
appsec (3.826 ms) : 3603, 4050
.   : milestone, 3826,
iast (2.277 ms) : 2207, 2346
.   : milestone, 2277,
iast_GLOBAL (2.313 ms) : 2243, 2383
.   : milestone, 2313,
profiling (2.114 ms) : 2059, 2169
.   : milestone, 2114,
tracing (2.099 ms) : 2045, 2153
.   : milestone, 2099,
section candidate
no_agent (1.49 ms) : 1478, 1501
.   : milestone, 1490,
appsec (3.816 ms) : 3592, 4039
.   : milestone, 3816,
iast (2.274 ms) : 2204, 2343
.   : milestone, 2274,
iast_GLOBAL (2.323 ms) : 2253, 2393
.   : milestone, 2323,
profiling (2.093 ms) : 2038, 2148
.   : milestone, 2093,
tracing (2.096 ms) : 2042, 2150
.   : milestone, 2096,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.489 ms [1.478 ms, 1.501 ms] | - |
| appsec | 3.826 ms [3.603 ms, 4.05 ms] | 2.337 ms (156.9%) |
| iast | 2.277 ms [2.207 ms, 2.346 ms] | 787.211 µs (52.9%) |
| iast_GLOBAL | 2.313 ms [2.243 ms, 2.383 ms] | 823.595 µs (55.3%) |
| profiling | 2.114 ms [2.059 ms, 2.169 ms] | 624.485 µs (41.9%) |
| tracing | 2.099 ms [2.045 ms, 2.153 ms] | 609.904 µs (41.0%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 1.49 ms [1.478 ms, 1.501 ms] | - |
| appsec | 3.816 ms [3.592 ms, 4.039 ms] | 2.326 ms (156.2%) |
| iast | 2.274 ms [2.204 ms, 2.343 ms] | 784.395 µs (52.7%) |
| iast_GLOBAL | 2.323 ms [2.253 ms, 2.393 ms] | 833.487 µs (56.0%) |
| profiling | 2.093 ms [2.038 ms, 2.148 ms] | 603.849 µs (40.5%) |
| tracing | 2.096 ms [2.042 ms, 2.15 ms] | 606.181 µs (40.7%) |
Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~340f25c3f4, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.023 s) : 15023000, 15023000
.   : milestone, 15023000,
appsec (14.834 s) : 14834000, 14834000
.   : milestone, 14834000,
iast (18.346 s) : 18346000, 18346000
.   : milestone, 18346000,
iast_GLOBAL (18.334 s) : 18334000, 18334000
.   : milestone, 18334000,
profiling (14.782 s) : 14782000, 14782000
.   : milestone, 14782000,
tracing (15.261 s) : 15261000, 15261000
.   : milestone, 15261000,
section candidate
no_agent (15.52 s) : 15520000, 15520000
.   : milestone, 15520000,
appsec (15.017 s) : 15017000, 15017000
.   : milestone, 15017000,
iast (18.169 s) : 18169000, 18169000
.   : milestone, 18169000,
iast_GLOBAL (18.027 s) : 18027000, 18027000
.   : milestone, 18027000,
profiling (14.757 s) : 14757000, 14757000
.   : milestone, 14757000,
tracing (15.001 s) : 15001000, 15001000
.   : milestone, 15001000,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 15.023 s [15.023 s, 15.023 s] | - |
| appsec | 14.834 s [14.834 s, 14.834 s] | -189.0 ms (-1.3%) |
| iast | 18.346 s [18.346 s, 18.346 s] | 3.323 s (22.1%) |
| iast_GLOBAL | 18.334 s [18.334 s, 18.334 s] | 3.311 s (22.0%) |
| profiling | 14.782 s [14.782 s, 14.782 s] | -241.0 ms (-1.6%) |
| tracing | 15.261 s [15.261 s, 15.261 s] | 238.0 ms (1.6%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
| --- | --- | --- |
| no_agent | 15.52 s [15.52 s, 15.52 s] | - |
| appsec | 15.017 s [15.017 s, 15.017 s] | -503.0 ms (-3.2%) |
| iast | 18.169 s [18.169 s, 18.169 s] | 2.649 s (17.1%) |
| iast_GLOBAL | 18.027 s [18.027 s, 18.027 s] | 2.507 s (16.2%) |
| profiling | 14.757 s [14.757 s, 14.757 s] | -763.0 ms (-4.9%) |
| tracing | 15.001 s [15.001 s, 15.001 s] | -519.0 ms (-3.3%) |

typotter added 5 commits April 9, 2026 09:29
Record a `feature_flag.evaluations` OTel counter on every flag evaluation
using an OpenFeature `finallyAfter` hook. The hook captures all evaluation
paths including type mismatches that occur above the provider level.

Attributes: feature_flag.key, feature_flag.result.variant,
feature_flag.result.reason, error.type (on error),
feature_flag.result.allocation_key (when present).

Counter is a no-op when DD_METRICS_OTEL_ENABLED is false or
opentelemetry-api is absent from the classpath.

Replace GlobalOpenTelemetry.getMeterProvider() with a dedicated
SdkMeterProvider + OtlpHttpMetricExporter that sends metrics
directly to the DD Agent's OTLP endpoint (default :4318/v1/metrics).

This avoids the agent's OTel class shading issue where the agent
relocates io.opentelemetry.api.* to datadog.trace.bootstrap.otel.api.*,
making GlobalOpenTelemetry calls from the dd-openfeature jar hit the
unshaded no-op provider instead of the agent's shim.

Requires opentelemetry-sdk-metrics and opentelemetry-exporter-otlp
on the application classpath. Falls back to no-op if absent.

System tests: 11/17 pass. 6 failures are pre-existing DDEvaluator
gaps (reason mapping, parse errors, type mismatch strictness).

- Add explicit null guard for details in FlagEvalHook.finallyAfter()
- Add OTEL_EXPORTER_OTLP_ENDPOINT generic env var fallback with
  /v1/metrics path appended (per OTel spec fallback chain)
- Add comments clarifying signal-specific vs generic endpoint behavior

When the OTel SDK jars are not on the application classpath,
loading FlagEvalMetrics fails because field types reference
OTel SDK classes (SdkMeterProvider). This propagated as an
uncaught NoClassDefFoundError from the Provider constructor,
crashing provider initialization.

Fix:
- Change meterProvider field type from SdkMeterProvider to
  Closeable (always on classpath), use local SdkMeterProvider
  variable inside try block
- Catch NoClassDefFoundError in Provider constructor when
  creating FlagEvalMetrics
- Null-safe getProviderHooks() and shutdown() when metrics
  is null

FlagEvalHook references FlagEvalMetrics in its field declaration.
On JVMs that eagerly verify field types during class loading,
constructing FlagEvalHook outside the try/catch could throw
NoClassDefFoundError if OTel classes failed to load. Moving it
inside the try block ensures both metrics and hook are null-safe
when OTel is absent.
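
The guard described in the last two commit messages can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's Provider class: `OptionalMetricsSketch` and its factory parameter are hypothetical, the hook is a placeholder object, and the missing OTel jars are simulated by a factory that throws NoClassDefFoundError.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

final class OptionalMetricsSketch {
    // Typed as Closeable so this class loads even when the optional SDK types are missing.
    private final Closeable metrics;
    private final Object hook; // placeholder for a hook built from the metrics instance

    OptionalMetricsSketch(Supplier<Closeable> metricsFactory) {
        Closeable m = null;
        Object h = null;
        try {
            m = metricsFactory.get(); // may throw NoClassDefFoundError when optional jars are absent
            h = new Object();         // built inside the try so both fields stay null on failure
        } catch (NoClassDefFoundError | RuntimeException e) {
            m = null;
            h = null;
        }
        this.metrics = m;
        this.hook = h;
    }

    // Null-safe: no hooks are registered when metrics could not be created.
    List<Object> getProviderHooks() {
        return hook == null ? Collections.emptyList() : Collections.singletonList(hook);
    }

    // Null-safe: nothing to close when metrics could not be created.
    void shutdown() {
        if (metrics == null) {
            return;
        }
        try {
            metrics.close();
        } catch (IOException ignored) {
            // Best effort on shutdown in this sketch.
        }
    }

    public static void main(String[] args) {
        OptionalMetricsSketch absent = new OptionalMetricsSketch(() -> {
            throw new NoClassDefFoundError("io/opentelemetry/sdk/metrics/SdkMeterProvider");
        });
        System.out.println("hooks when SDK is absent: " + absent.getProviderHooks().size());
        absent.shutdown();
    }
}
```

Keeping the field type on an always-available interface moves the failure from class loading to the factory call, where it can be caught.
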
@typotter typotter force-pushed the typo/evaluations-logging branch from 4cb7bab to 69c5529 Compare April 9, 2026 15:30
Documents the published artifact setup, evaluation metrics
dependencies (opentelemetry-sdk-metrics, opentelemetry-exporter-otlp),
OTLP endpoint configuration, metric attributes, and requirements.
@typotter typotter marked this pull request as ready for review April 9, 2026 17:41
@typotter typotter requested a review from a team as a code owner April 9, 2026 17:41
@typotter typotter requested review from leoromanovsky and sameerank and removed request for a team April 9, 2026 17:41
System.getenv() is forbidden by the project's forbiddenApis rules.
Replace with ConfigHelper.env() which is the approved way to read
environment variables. Add config-utils as compileOnly dependency.

@sameerank sameerank left a comment


Thanks for helping with this! I agree it was a good idea to break out the system test fixes into separate PRs to keep this one brief and focused

<artifactId>dd-openfeature</artifactId>
<version>${dd-openfeature.version}</version>
</dependency>
<dependency>

This is a transitive dependency of our provider, I think it might be better to skip it (otherwise customers will need to ensure compatibility).

String flagKey = details.getFlagKey();
String variant = details.getVariant();
String reason = details.getReason();
dev.openfeature.sdk.ErrorCode errorCode = details.getErrorCode();

nit: we probably want to import dev.openfeature.sdk.ErrorCode

.setUnit(METRIC_UNIT)
.setDescription(METRIC_DESC)
.build();
} catch (NoClassDefFoundError | Exception e) {

Wouldn't it be better to just let the error flow to the Provider class since it's already capturing the exception?

Member

@manuel-alvarez-alvarez manuel-alvarez-alvarez left a comment


LGTM, just left a couple of minor comments

- Remove transitive openfeature-sdk dep from README setup section
- Import ErrorCode at top of FlagEvalHook instead of inline FQN

Labels

comp: openfeature OpenFeature tag: ai generated Largely based on code generated by an AI or LLM type: feature request


3 participants